BABEL: An Eastern European Multi-Language Database
Abstract
BABEL is a joint European project under the COPERNICUS scheme (Project #1304) comprising partners from five Eastern European countries and three Western ones. The project is producing a multi-language database of five of the most widely differing Eastern European languages. The collection and formatting of the data conform to the protocols established by the ESPRIT SAM project and the resulting EUROM databases.
Related resources
Language independent and unsupervised acoustic models for speech recognition and keyword spotting
Developing high-performance speech processing systems for low-resource languages is very challenging. One approach to address the lack of resources is to make use of data from multiple languages. A popular direction in recent years is to train a multi-language bottleneck DNN. Language dependent and/or multi-language (all training languages) Tandem acoustic models are then trained. This work con...
2016 BUT Babel System: Multilingual BLSTM Acoustic Model with i-Vector Based Adaptation
The paper provides an analysis of BUT automatic speech recognition systems (ASR) built for the 2016 IARPA Babel evaluation. The IARPA Babel program concentrates on building ASR system for many low resource languages, where only a limited amount of transcribed speech is available for each language. In such scenario, we found essential to train the ASR systems in a multilingual fashion. In this w...
SpeechDat(E) - Eastern European Telephone Speech Databases
This paper describes the creation of five new telephony speech databases for Central and Eastern European languages within the SpeechDat(E) project. The 5 languages concerned are Czech, Polish, Slovak, Hungarian, and Russian. The databases follow SpeechDat-II specifications with some language specific adaptation. The present paper describes the differences between SpeechDat(E) and earlier Speec...
Implementation Tricks in the Hungarian Babel Module
magyar.ldf, the Hungarian Babel module, was rewritten in the autumn of 2003 to obey most of the Hungarian typographical rules. This article describes some implementation issues, TeX macro programming hacks, and LaTeX typesetting trickery used in magyar.ldf. All features of the new magyar.ldf are enumerated, but only those having an interesting implementation are presented in detail. Most of the...
Babel: An XML-Based Application Integration
One of the major problems in integrating independently developed applications is the divergence between the data and control-of-processing models assumed by these applications. Research on database integration has focused on establishing and maintaining a canonical schema on top of the schemas of the underlying databases. At the same time, web-accessible software systems have been adopting a mu...